-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: introduce mutex for state and lastCommitInfo to avoid race conditions #22692
base: main
Are you sure you want to change the base?
Conversation
…betwwen Commit and CreateQueryContext
📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces several enhancements and bug fixes across multiple modules within the Cosmos SDK. Key changes include the addition of features for simulating nested messages, a new Linux-only backend for the crypto/keyring module, and improvements to integration tests. Concurrency control is enhanced through the introduction of mutex locks to prevent race conditions in state management. The changelog has been updated to reflect these changes, and new test cases have been added to ensure the stability of the application under concurrent conditions. Changes
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Warning There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure. 🔧 golangci-lint (1.62.2)level=warning msg="[runner] Can't run linter goanalysis_metalinter: buildir: failed to load package sqlite3: could not load export data: no export data for "github.com/bvinc/go-sqlite-lite/sqlite3"" Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (13)
go.mod (1)
218-219
: Consider documenting the temporary nature of this replace directive.Since this replace directive is part of the race condition fix, consider moving it to the temporary replace section with a comment explaining its purpose.
// Here are the short-lived replace from the Cosmos SDK // Replace here are pending PRs, or version to be tagged -// replace ( -// <temporary replace> -// ) +replace ( + // Temporary: Using local store module for race condition fix (#22650) + cosmossdk.io/store => ./store +)CHANGELOG.md (4)
Line range hint
1-1
: Add version table at the top of changelogConsider adding a version table at the top that summarizes all v0.46.x releases with their dates and key changes for easier reference.
Line range hint
11-13
: Fix inconsistent bullet point formattingThe bullet points use a mix of
*
and-
. Standardize on one format throughout the document for consistency.
59-61
: Add impact details for breaking changesFor breaking changes like the Dragonberry security fix, consider adding more details about potential impact on users and required actions.
Line range hint
1-1000
: Fix typos and grammatical errorsSeveral typos and grammatical errors found throughout the document:
- "typographical" misspelled as "typograhical"
- Missing periods at end of some bullet points
- Inconsistent capitalization in section headers
store/rootmulti/store.go (1)
63-63
: Naming convention for mutex variables.Consider renaming
lastCommitInfoMut
tolastCommitInfoMu
to align with Go conventions, where mutex variables are typically suffixed withMu
.baseapp/baseapp.go (2)
127-127
: Consider renamingstateMut
tostateMutex
for clarity.According to the Uber Go Style Guide, abbreviations in variable names should be avoided to enhance readability. Renaming
stateMut
tostateMutex
makes the purpose of the mutex clearer.Apply this diff to improve clarity:
- stateMut sync.RWMutex + stateMutex sync.RWMutex
Line range hint
583-586
: Potential deadlock due to inconsistent lock ordering betweenapp.mu
andapp.stateMut
.In
getContextForTx
,app.mu
is locked before callingapp.getState(mode)
, which in turn locksapp.stateMut
. If elsewhereapp.stateMut
is locked before attempting to acquireapp.mu
, this could lead to a deadlock due to inconsistent lock ordering.Recommend reviewing the locking order to ensure that
app.mu
andapp.stateMut
are always acquired in a consistent order throughout the codebase to prevent potential deadlocks.baseapp/abci.go (3)
609-609
: Handle possible errors fromCacheContext
In the assignments:
ctx, _ = app.getState(execModeFinalize).Context().CacheContext()Ignoring the second return value without handling could overlook potential issues. Although
CacheContext()
may not currently return an error, future changes could introduce errors.Consider capturing both return values explicitly for clarity:
ctx, ms := app.getState(execModeFinalize).Context().CacheContext() // use ms if neededAlso applies to: 689-689
747-747
: Clarify the comment for better understandingThe comment at line 747 is unclear:
// only used to handle early cancellation, for anything related to state app.getState(execModeFinalize).Context()
Consider rephrasing for clarity:
// Use ctx only for early cancellation handling. For state-related operations, use app.getState(execModeFinalize).Context()
1194-1194
: Review the use ofCacheContext
In:
ctx, _ = app.getState(execModeFinalize).Context().CacheContext()Ensure that caching the context here is intended and that any modifications do not affect the main context unintentionally.
baseapp/abci_test.go (2)
2783-2830
: Add a function comment to describe the purpose of the testTo improve code readability and maintainability, consider adding a comment before the
TestABCI_Race_Commit_Query
function to explain its purpose and what it is testing.
2796-2797
: Simplify atomic counter initializationSince the zero value of
atomic.Uint64
is zero, the explicit call tocounter.Store(0)
is unnecessary. You can simplify the code by removing this call.Apply the following diff:
-counter := atomic.Uint64{} -counter.Store(0) +var counter atomic.Uint64
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (7)
CHANGELOG.md
(1 hunks)baseapp/abci.go
(21 hunks)baseapp/abci_test.go
(2 hunks)baseapp/baseapp.go
(7 hunks)baseapp/test_helpers.go
(1 hunks)go.mod
(1 hunks)store/rootmulti/store.go
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
CHANGELOG.md (1)
Pattern **/*.md
: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"
baseapp/abci.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
baseapp/abci_test.go (2)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
Pattern **/*_test.go
: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"
baseapp/baseapp.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
baseapp/test_helpers.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
store/rootmulti/store.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
🔇 Additional comments (20)
baseapp/test_helpers.go (2)
47-47
: LGTM: Thread-safe state access implemented
The change from direct state access to using getState
aligns with the PR's objective of preventing race conditions. This modification ensures thread-safe access to the state during simulations.
Let's verify the thread safety implementation:
✅ Verification successful
Thread-safe state access confirmed via RLock/RUnlock
The verification confirms that getState
method properly implements thread-safety using RLock
/RUnlock
from stateMut
. The implementation correctly protects state access across different execution modes, including the execModeFinalize
used in the test helper.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that getState method uses proper synchronization
# Expected: Find mutex usage in getState implementation
ast-grep --pattern 'func (app *BaseApp) getState($_) $_ {
$$$
}'
Length of output: 831
53-58
: Thread-safe state access implemented, verify performance impact
The changes correctly implement thread-safe state access using getState
for both check and finalize modes. This aligns with fixing the race conditions identified in issue #22650.
Let's verify the usage patterns and potential performance impact:
✅ Verification successful
Based on the search results, I can now provide a final response since I have enough information about the usage patterns of NewContextLegacy
:
Thread safety changes are safe with minimal performance impact
The changes to NewContextLegacy
using getState
are safe because:
- The method is primarily used in test code and initialization paths, not in hot execution paths
- Main usage patterns are in:
- Test helpers and test suites
- App initialization and genesis setup
- Export functionality
- Simulation framework
The added synchronization through getState
won't cause performance issues since this method isn't called in performance-critical paths or high-frequency operations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for high-frequency calls to NewContextLegacy that might be affected by the added synchronization
# Expected: Identify critical paths that might need optimization
# Find direct calls to NewContextLegacy
rg "NewContextLegacy\(" -A 2
# Find potential high-frequency callers through NewContext
rg "NewContext\(" -A 2
Length of output: 40662
go.mod (1)
219-219
: LGTM! Verify version consistency with other local modules.
The addition of the store module replace directive aligns with the existing pattern and is necessary for implementing the race condition fix. However, we should ensure version consistency.
✅ Verification successful
Let me verify the version consistency further by checking other local module replacements.
Version consistency verified across local module replacements
The replace directive for the store module follows the same pattern as other local module replacements (api, bank, staking, tx). All local modules use Go 1.23, and the version specified in the main go.mod (v1.1.1-0.20240418092142-896cdf1971bc) is consistent with the development nature of these changes.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check version consistency between go.mod and local store module
# Verify the store module's go.mod version matches the version specified in the main go.mod
# Get the version from main go.mod
echo "Main module store version:"
grep "cosmossdk.io/store" go.mod | head -n 1
# Get the version from local store module
echo "Local store module version:"
if [ -f "store/go.mod" ]; then
grep "module cosmossdk.io/store" store/go.mod
fi
Length of output: 329
Script:
#!/bin/bash
# Check all local module replacements and their versions
echo "Main go.mod versions:"
grep "cosmossdk.io/" go.mod | grep -v "replace" | sort
echo -e "\nReplace directives:"
grep "replace cosmossdk.io/" go.mod | sort
echo -e "\nLocal module versions:"
for dir in api bank staking tx store; do
if [ -f "$dir/go.mod" ]; then
echo -e "\n$dir/go.mod:"
grep "^module" "$dir/go.mod"
grep "^go 1." "$dir/go.mod"
fi
done
Length of output: 1573
store/rootmulti/store.go (7)
63-63
: Introduction of lastCommitInfoMut
for concurrency control.
Adding lastCommitInfoMut sync.RWMutex
to the Store
struct appropriately introduces a mutex to prevent race conditions when accessing lastCommitInfo
.
292-294
: Proper locking around lastCommitInfo
assignment in loadVersion()
.
Locking lastCommitInfoMut
before updating rs.lastCommitInfo
ensures thread-safe write operations.
440-446
: Consistent use of LastCommitInfo()
in LatestVersion()
.
Updating LatestVersion()
to use rs.LastCommitInfo()
ensures safe concurrent access to lastCommitInfo
.
448-451
: Thread-safe access to lastCommitInfo
via LastCommitInfo()
method.
The new LastCommitInfo()
method correctly implements read locking to allow safe concurrent reads of lastCommitInfo
.
456-474
: Safe access to lastCommitInfo
in LastCommitID()
.
Using rs.LastCommitInfo()
in LastCommitID()
prevents data races during read operations of lastCommitInfo
.
518-521
: Ensure minimal locking when updating lastCommitInfo.Timestamp
.
Holding the lock only during the assignment to lastCommitInfo.Timestamp
is acceptable. Confirm that no other operations within the lock could cause delays or potential deadlocks.
800-802
: Thread-safe retrieval of lastCommitInfo
in Query()
.
Using rs.LastCommitInfo()
in the Query()
method ensures safe concurrent access and prevents race conditions.
baseapp/abci.go (9)
72-72
: Good use of encapsulation with app.getState
The introduction of app.getState(execModeFinalize)
enhances state management by encapsulating state retrieval, improving code maintainability and readability.
78-78
: Ensure req.ConsensusParams
is validated
While storing consensus parameters, make sure that req.ConsensusParams
is not nil and contains valid data to prevent potential runtime errors.
90-97
: Consistent Context Update for States
Updating both checkState
and finalizeState
contexts ensures consistency across different execution modes. This is a good practice for maintaining state integrity.
110-110
: Initialization of Infinite Gas Meter
Setting an infinite gas meter during InitChain
is appropriate to allow unrestricted operations during genesis. Ensure that this does not inadvertently persist into states where gas limits should be enforced.
936-939
: Ensure proper mutex usage when modifying shared state
Locking app.stateMut
before modifying app.finalizeBlockState
is correct. Ensure that all other accesses to shared state variables are similarly protected to prevent race conditions.
1024-1026
: Mutex locking and unlocking
Proper use of app.stateMut.Lock()
and app.stateMut.Unlock()
ensures thread-safe operations when modifying shared states.
1047-1047
: Ensure ms.Write()
is safe and necessary
Calling ms.Write()
writes pending changes to the parent store. Confirm that this call is necessary and that any potential errors are handled appropriately.
1409-1409
: Verify consensus parameters retrieval
Retrieving consensus parameters using app.getState(execModeFinalize).Context()
should be thread-safe and consistent with the application's state at this point.
978-979
:
Accessing finalizeState
without synchronization
Accessing finalizeState
here may lead to race conditions if not properly synchronized. Verify that app.stateMut
is held during this access.
Run the following script to identify unsynchronized accesses to app.getState
:
Also applies to: 983-983
baseapp/abci_test.go (1)
2796-2797
: Verify Go version compatibility for atomic.Uint64
The atomic.Uint64
type is available from Go 1.19 onwards. Ensure that the project specifies Go 1.19 or newer in its configuration to avoid compatibility issues.
Run the following script to check the Go version specified in the go.mod
file:
✅ Verification successful
Go version compatibility verified for atomic.Uint64
The project uses Go 1.23.2, which fully supports atomic.Uint64
(introduced in Go 1.19). No compatibility issues exist.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check that the project requires Go 1.19 or newer.
# Look for the 'go' directive in the go.mod file
grep "^go " go.mod
Length of output: 30
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
store/rootmulti/store.go (1)
514-522
: Consider combining the mutex locks.The code currently acquires and releases the mutex twice in quick succession. Consider combining these operations to reduce lock overhead:
- rs.lastCommitInfoMut.Lock() - rs.lastCommitInfo = cInfo - rs.lastCommitInfoMut.Unlock() - - rs.lastCommitInfoMut.Lock() - rs.lastCommitInfo.Timestamp = rs.commitHeader.Time - rs.lastCommitInfoMut.Unlock() + rs.lastCommitInfoMut.Lock() + rs.lastCommitInfo = cInfo + rs.lastCommitInfo.Timestamp = rs.commitHeader.Time + rs.lastCommitInfoMut.Unlock()
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (1)
store/rootmulti/store.go
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
store/rootmulti/store.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
🔇 Additional comments (4)
store/rootmulti/store.go (4)
63-63
: LGTM: Good choice of RWMutex for lastCommitInfo synchronization.
Using sync.RWMutex is appropriate here as it allows concurrent reads while ensuring exclusive writes, which aligns with the access patterns of lastCommitInfo.
292-294
: LGTM: Appropriate lock scope in LoadVersion.
The mutex lock is correctly scoped to only protect the critical section (lastCommitInfo assignment), minimizing lock contention.
448-451
: LGTM: Well-implemented thread-safe accessor.
The method correctly uses RLock/RUnlock for concurrent read access to lastCommitInfo. The defer statement ensures proper unlock even in case of panics.
440-446
: LGTM: Consistent usage of thread-safe accessor.
All methods consistently use the thread-safe LastCommitInfo accessor, properly addressing the race condition between Commit and CreateQueryContext operations.
Also applies to: 456-474, 802-804
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
store/rootmulti/store.go (4)
63-63
: Excellent use of atomic.Pointer for thread safety.Using atomic.Pointer instead of mutex locks is a more efficient solution for this use case since the CommitInfo content itself is not modified concurrently, only the pointer is updated atomically.
This approach provides better performance than mutex locks as it:
- Avoids lock contention
- Allows for lock-free reads
- Provides atomic pointer updates
438-443
: Consider caching the atomic load result.Multiple accesses to lastCommitInfo.Load() could be optimized by storing the result in a local variable.
-lastCommitInfo := rs.lastCommitInfo.Load() -if lastCommitInfo == nil { - return GetLatestVersion(rs.db) -} - -return lastCommitInfo.Version +if info := rs.lastCommitInfo.Load(); info == nil { + return GetLatestVersion(rs.db) +} else { + return info.Version +}
510-512
: Optimize multiple atomic loads.Multiple consecutive calls to lastCommitInfo.Load() should be consolidated to a single load.
-cInfo := rs.lastCommitInfo.Load() -cInfo.Timestamp = rs.commitHeader.Time - +if cInfo := rs.lastCommitInfo.Load(); cInfo != nil { + cInfo.Timestamp = rs.commitHeader.Time +}
791-793
: Consolidate atomic load operation.The atomic load result should be cached to avoid multiple loads and null checks.
-lastCommitInfo := rs.lastCommitInfo.Load() -if res.Height == lastCommitInfo.Version { - commitInfo = lastCommitInfo +if info := rs.lastCommitInfo.Load(); info != nil && res.Height == info.Version { + commitInfo = info
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (1)
store/rootmulti/store.go
(9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
store/rootmulti/store.go (1)
Pattern **/*.go
: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
🔇 Additional comments (1)
store/rootmulti/store.go (1)
12-12
: LGTM: Required import for atomic operations.
The addition of "sync/atomic" import is necessary for the atomic pointer implementation.
@beer-1 can you resolve conflict for this? We'd like to merge :) |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Head branch was pushed to by a user without write access
sorry for late update, just did conflict fix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
CHANGELOG.md (4)
1-2
: Add missing version information in titleThe title should include the version number for better clarity.
- # Changelog + # Changelog (v0.50.x)
39-41
: Consider adding more context to unreleased sectionThe unreleased section could benefit from additional context about the scope of changes and expected release timeline.
## [Unreleased] + This section tracks all notable changes that will be part of the next release. + Target release date: TBD + Every module contains its own CHANGELOG.md. Please refer to the module you are interested in.
1634-1635
: Add missing security advisory detailsThe security advisory reference should include more details about the vulnerability and mitigation steps.
This release fixes security vulnerability identified in x/bank. + + ## Security Advisory Details + - Vulnerability: [Brief description] + - Impact: [Description of potential impact] + - Mitigation: [Steps taken to fix] + - Affected versions: [List of versions]
3156-3158
: Add version cutoff date for previous releasesThe reference to previous releases should specify the version cutoff date for better clarity.
## Previous Releases - [CHANGELOG of previous versions](https://github.com/cosmos/cosmos-sdk/blob/c17c3caab86a1426a1eef4541e8203f5f54a1a54/CHANGELOG.md#v0391---2020-08-11) (pre Stargate). + [CHANGELOG of previous versions](https://github.com/cosmos/cosmos-sdk/blob/c17c3caab86a1426a1eef4541e8203f5f54a1a54/CHANGELOG.md#v0391---2020-08-11) (pre Stargate, versions before v0.40.0).
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
client/v2/go.sum
is excluded by!**/*.sum
go.sum
is excluded by!**/*.sum
server/v2/cometbft/go.sum
is excluded by!**/*.sum
simapp/v2/go.sum
is excluded by!**/*.sum
📒 Files selected for processing (6)
CHANGELOG.md
(1 hunks)baseapp/test_helpers.go
(1 hunks)client/v2/go.mod
(2 hunks)go.mod
(4 hunks)server/v2/cometbft/go.mod
(2 hunks)simapp/v2/go.mod
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- client/v2/go.mod
- simapp/v2/go.mod
- server/v2/cometbft/go.mod
- go.mod
- baseapp/test_helpers.go
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.md`: "Assess the documentation for misspellings, gramm...
**/*.md
: "Assess the documentation for misspellings, grammatical errors, missing documentation and correctness"
CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: tests (00)
- GitHub Check: test-system-v2
- GitHub Check: Analyze
- GitHub Check: Summary
🔇 Additional comments (2)
CHANGELOG.md (2)
4-11
: LGTM! Well-structured guiding principlesThe guiding principles section clearly outlines the key aspects of maintaining a good changelog, making it easy for contributors to understand the requirements.
13-34
: LGTM! Clear usage instructionsThe usage instructions provide a comprehensive guide on how to add changelog entries, including the required format and stanza types. The examples and explanations are clear and helpful.
Hey @aljo242 can you check this again? |
Description
Closes: #22650
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!
in the type prefix if API or client breaking changeCHANGELOG.md
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
Please see Pull Request Reviewer section in the contributing guide for more information on how to review a pull request.
I have...
Summary by CodeRabbit
New Features
ethsecp256k1
.Improvements
Bug Fixes